Numerical Solution to the Performability of a Multiprocessor System with Reconfiguration and Rebooting Delays

Authors

  • Orhan Gemikonakli
  • Tien Van Do
Abstract

Multiprocessor system models are used extensively in modelling transaction processing systems, nodes in communication networks, and flexible machine shops with groups of machines. Such systems are clearly prone to breakdowns. Even if cover is provided with some probability c, there will be rebooting and/or reconfiguration delays before operation resumes following the breakdown of a processor. In this paper, the performance modelling of a multiprocessor system with identical processors, serving a stream of arriving jobs, is considered. To account for delays due to reconfiguration and rebooting, such systems are modelled and solved for exact performability measures for both bounded and unbounded queuing systems.

INTRODUCTION

Multiserver system models are useful for modelling multiprocessor systems (Trivedi 2002; Harrison and Patel 1993), nodes in communication networks, and flexible machine shops (Stecke and Kim 1989; Stecke 1992; Righter 1996; Buzacott and Shantikumar 1993; Fiems et al. 2004) in a manufacturing environment. In this paper we develop approaches to model homogeneous multiprocessor systems with reconfiguration and rebooting delays by suitably extending the quasi-birth-death (QBD) process that arises in the performance models of multiprocessor systems with breakdowns and repair strategies (Chakka and Mitrani 1992; Chakka et al. 2002). This problem was considered in (Trivedi and Sathaye 1990), where an approximate performance model based on Markov reward models was presented. In this paper, we derive an exact solution for the steady-state probabilities of the same problem using the spectral expansion method, and we analyse the effects of reconfiguration and rebooting delays.

The paper is organised as follows. The next section presents the homogeneous multiprocessor system with breakdowns and repairs considered in this work, and models the system as a QBD process.
The section on modelling reconfiguration and rebooting delays in multiprocessor systems deals with a homogeneous multiprocessor system with breakdowns, repairs, and reconfiguration and rebooting delays (Trivedi and Sathaye 1990). An exact solution for the steady-state performability is derived using the spectral expansion method in the section on the steady state solution. The model considered is very useful in the computer industry. The exact solution to this model and numerical results are also presented for both unbounded and bounded systems.

MULTIPROCESSOR SYSTEM WITH IDENTICAL PROCESSORS

The homogeneous multiprocessor system, shown in Figure 1, consists of K identical parallel processors, numbered 1, 2, ..., K, with a common queue. The queue has capacity L (finite or infinite, L ≥ K), including the jobs in service. Jobs arrive into the system in a Poisson stream at a constant arrival rate and join the queue. Jobs are homogeneous and the service rates of the processors are assumed identical. Thus, the service times of jobs served by processor k (k = 1, 2, ..., K) are exponentially distributed with a common mean, the reciprocal of the service rate. However, processor k executes jobs only during its operative periods (during an operative period the processor is capable of its intended operation, whether working or idle), which are exponentially distributed with mean equal to the reciprocal of a constant failure rate. At the end of an operative period, processor k breaks down and requires an exponentially distributed repair time, with mean equal to the reciprocal of the repair rate. The number of repairs that may proceed in parallel may be restricted. This is expressed by saying that there are R repairmen (R ≤ K), each of whom can work on at most one repair at a time. Thus, an inoperative period of a processor may also include time spent waiting for a repairman. No operative processor can be idle if there are jobs awaiting service, and no repairman can be idle if there are broken-down processors waiting for repair.
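The breakdown and repair mechanism described above, taken on its own and without reconfiguration or rebooting delays, is a birth-death chain on the number of operative processors. The following is a minimal sketch of its stationary distribution, assuming illustrative names (`failure_rate`, `repair_rate`) and illustrative numerical values that do not come from the paper.

```python
from math import isclose


def operative_distribution(K, R, failure_rate, repair_rate):
    """Stationary distribution of the number of operative processors.

    Birth-death chain on i = 0..K operative processors (no delays modelled):
      i -> i-1 at rate i * failure_rate              (any operative processor may fail)
      i -> i+1 at rate min(K - i, R) * repair_rate   (at most R repairs in parallel)
    """
    weights = [1.0]  # unnormalised weight of state 0
    for i in range(1, K + 1):
        up = min(K - (i - 1), R) * repair_rate  # rate from i-1 to i
        down = i * failure_rate                 # rate from i to i-1
        # Detailed balance: weight(i-1) * up = weight(i) * down
        weights.append(weights[-1] * up / down)
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical parameter values, chosen only for illustration.
dist = operative_distribution(K=3, R=1, failure_rate=0.01, repair_rate=0.5)
mean_operative = sum(i * p for i, p in enumerate(dist))
assert isclose(sum(dist), 1.0)
```

This covers only the processor configuration; the joint behaviour with the job queue is what requires the QBD treatment.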
All inter-arrival, service, reconfiguration, rebooting, operative and repair time random variables are independent of each other. The mean reconfiguration delay and the mean rebooting delay relate to the system and not to individual processors. This system was analysed for exact performability (Chakka and Mitrani 1994; Chakka 1995) for a single repairman (R = 1) and queue capacity L, and for some repair strategies, but reconfiguration and rebooting delays were not considered.

MODELLING RECONFIGURATION AND REBOOTING DELAYS IN MULTIPROCESSOR SYSTEMS

In practice, however, a multiprocessor system encounters some delay when a failed processor is being mapped out of the system (reconfiguration/rebooting delay) and when a repaired processor is being admitted into the system. A system affected by such reconfiguration and rebooting delays can be modelled effectively using the spectral expansion method.

Consider the homogeneous multiprocessor system with K processors given in Figure 1. All processors have the same service rate and the same failure rate. There is a single repair facility (i.e. R = 1) with a given repair rate. When a processor fails, the fault is covered with probability c and is not covered with probability 1-c. After a covered fault, the system comes up in a degraded mode following a brief reconfiguration delay, while after an uncovered fault a longer reboot is required to bring the system up in a degraded mode. Here, degraded mode indicates a state with one fewer operative processor than the previous state. During the reconfiguration/rebooting period, the system is assumed to be down.

Figure 1: A Homogeneous Multiprocessor System with Breakdowns, Repairs, Reconfiguration and Rebooting Delays

If there are more operative processors than jobs in the system, then the busy processors are selected randomly. Services that are interrupted by breakdowns are eventually resumed (perhaps on a different processor but at a similar service rate).
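The coverage mechanism described in this section implies a simple expression for the mean time the system is down after a single processor failure: with probability c the delay is a reconfiguration, otherwise it is a reboot. A minimal sketch follows, assuming hypothetical parameter names and values that are not taken from the paper.

```python
def mean_recovery_delay(c, mean_reconfiguration, mean_reboot):
    """Expected system down time following one processor failure.

    c                     -- coverage probability (fault is covered)
    mean_reconfiguration  -- mean delay after a covered fault
    mean_reboot           -- mean delay after an uncovered fault
    """
    if not 0.0 <= c <= 1.0:
        raise ValueError("coverage probability must lie in [0, 1]")
    # Law of total expectation over the two fault types.
    return c * mean_reconfiguration + (1.0 - c) * mean_reboot


# Hypothetical values: 90% coverage, short reconfiguration, longer reboot.
delay = mean_recovery_delay(c=0.9, mean_reconfiguration=1.0, mean_reboot=10.0)
```

Because a reboot is the longer of the two actions, the expected delay falls as the coverage probability rises.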
Similarly, if R < K and the repair strategy allows preemption of repairs, then interrupted repairs are eventually resumed from the point of interruption, and there are no switching delays. The reconfiguration and rebooting times are exponentially distributed, each with its own mean. The queuing capacity is L, where L can be finite or infinite, and jobs arrive at a constant rate. An approximate performance model of this system was developed in (Trivedi and Sathaye 1990). We intend to carry out an exact performance evaluation of this system.

Figure 2 is the Markov chain that represents the operative states of the multiprocessor system. The state of the system at time t can be described by a pair of integer-valued random variables, I(t) and J(t), specifying the processor configuration (also termed the operative state of the multiprocessor system) and the number of jobs present, respectively. Here, the processor configuration, and hence the range of values of I(t), captures the number of operational processors and, where appropriate, the associated reconfiguration/rebooting delay. In general, assume that there are N+1 processor configurations (operative states of the multiprocessor), represented by the values I(t) = 0, 1, ..., N. The model assumptions ensure that I(t), t ≥ 0, is an irreducible Markov process. J(t) is the total number of jobs in the system at time t, including those in service. Then Z = {[I(t), J(t)]; t ≥ 0} is an irreducible Markov process on a lattice strip (a QBD process) that models the system. Its state space is {0, 1, ..., N} x {0, 1, ..., L}.

Figure 2: A Homogeneous Multiprocessor System with Breakdowns, Repairs, Rebooting and Reconfiguration Delays

The states labelled 1, 2, ..., K are the K working states of the multiprocessor; the state labelled k has k operative processors. State 0 means no processor is operational.
The K-1 states labelled X2, X3, ..., XK represent the reconfiguration delay, and the K-1 states labelled Y2, Y3, ..., YK are the rebooting delay states. Hence, the total number of operative states is 3K-1. Let these be renumbered as follows: states 0, 1, ..., K are unchanged; states X2, X3, ..., XK become K+1, K+2, ..., 2K-1; and states Y2, Y3, ..., YK become 2K, 2K+1, ..., 3K-2.

The downward transition matrices are C0 = [0] and Cj = Diag[Min{w(0), j}, Min{w(1), j}, ..., Min{w(3K-2), j}] for 1 ≤ j < K, with each diagonal entry multiplied by the service rate of a processor, where w(i) is the number of working processors in operative state i. We define the matrices Q(·) and Q̄(·) as before (Chakka 1998). Then the steady-state probabilities p_{i,j} can again be expressed in a manner similar to that shown in (Chakka 1998). From this, the required performability measures, such as the steady-state probabilities, the average number of jobs in the system, the utilization of the processors, and the mean response time, can be obtained exactly following the computational procedure found in (Chakka 1998; Chakka 1995). Using the steady-state probabilities, the response time distribution can also be derived.

The system can now be represented by a QBD process with a finite or infinite state space. The state of the system is defined by (I(t), J(t)), where I(t) is the operative state and J(t) is the number of jobs in the system. Let the operative states be represented in the horizontal direction and the number of jobs in the vertical direction of a two-dimensional lattice strip. Here A is the matrix of instantaneous transition rates from operative state i to operative state k, with zeros on the main diagonal. These are the purely lateral transitions of the model Z. Matrices B and C are the transition matrices for one-step upward and one-step downward transitions, respectively.
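The renumbering of the 3K-1 operative states and the diagonal matrices Cj described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names and `service_rate` are hypothetical, and w(i) is taken to be 0 in every reconfiguration (X) and rebooting (Y) state because the text states that the system is down during those periods.

```python
def operative_state_labels(K):
    """Labels in renumbered order: 0..K, then X2..XK, then Y2..YK."""
    working = [str(k) for k in range(K + 1)]
    reconfiguring = [f"X{k}" for k in range(2, K + 1)]
    rebooting = [f"Y{k}" for k in range(2, K + 1)]
    return working + reconfiguring + rebooting


def working_processors(K):
    """w(i) for i = 0..3K-2.

    Assumption: no processor works in an X or Y state (system is down).
    """
    return list(range(K + 1)) + [0] * (2 * (K - 1))


def departure_matrix(K, j, service_rate):
    """C_j as a dense list of lists: diagonal entry i is min(w(i), j) * service_rate."""
    w = working_processors(K)
    n = len(w)
    return [
        [min(w[i], j) * service_rate if i == row else 0.0 for i in range(n)]
        for row in range(n)
    ]


labels = operative_state_labels(3)
assert len(labels) == 3 * 3 - 1          # 3K-1 operative states
assert labels.index("X2") == 3 + 1       # X2 is renumbered K+1
assert labels.index("Y3") == 3 * 3 - 2   # YK is renumbered 3K-2
```

With j = 0 every diagonal entry is zero, which agrees with C0 = [0].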
When the transition rate matrices depend on j only up to a threshold M, where M is an integer, the process Z evolves with the following instantaneous transitions:

  • Aj: purely lateral transition rate, from state (i, j) to state (k, j) (0 ≤ i, k ≤ N; i ≠ k; j = 0, 1, ..., L), caused by a change in the operative state (i.e. a breakdown followed by reconfiguration or rebooting, or a repair).
  • Bj: one-step upward transition rate, from state (i, j) to state (k, j+1) (0 ≤ i, k ≤ N; j = 0, 1, ..., L), caused by a job arrival into the queue.
  • Cj: one-step downward transition rate, from state (i, j) to state (k, j-1) (0 ≤ i, k ≤ N; j = 1, 2, ..., L), caused by the departure of a serviced job.

THE STEADY STATE SOLUTION

The solution is given for an unbounded queue (i.e. K ≤ L < ∞) as well as a bounded queue (i.e. finite L ≥ K). Following the spectral expansion solution, the steady-state probabilities of the system considered can be expressed as:

$$p_{i,j} = \lim_{t \to \infty} P(I(t) = i,\; J(t) = j); \quad 0 \le i \le N,\; 0 \le j \le L$$

where L can be finite or infinite. Let's define diagonal matrices of size (N+1)x(N+1) as:

Similar resources

Performance Modelling of Multiprocessor System with one High Performance Server

Multiprocessor system models are currently very important and widely used in modelling transaction processing systems, communication networks, mobile networks, and flexible machine shops with groups of machines. A heterogeneous processor system with one faster main server and several identical servers is studied. In this paper reconfiguration and rebooting delays are considered to study the pe...

A Multiprocessor System with Non-Preemptive Earliest-Deadline-First Scheduling Policy: A Performability Study

This paper introduces an analytical method for approximating the performability of a firm real-time system modeled by a multi-server queue. The service discipline in the queue is earliest-deadline-first (EDF), which is an optimal scheduling algorithm. Real-time jobs with exponentially distributed relative deadlines arrive according to a Poisson process. All jobs have deadlines until the end of s...

Analytical modelling and simulation of small scale, typical and highly available Beowulf clusters with breakdowns and repairs

Beowulf clusters are very popular because of the high computational power they can provide at reasonably low costs. However, the most pressing issues of today’s cluster solutions are the need for high availability and performance. Cluster systems are clearly prone to failures. Even if cover is provided with some probability c, there would be reconfiguration and/or rebooting delays to resume th...

The Numerical Solution of Some Optimal Control Systems with Constant and Pantograph Delays via Bernstein Polynomials

In this paper, we present a numerical method based on Bernstein polynomials to solve optimal control systems with constant and pantograph delays. Constant or pantograph delays may appear in state-control or both. We derive a delay operational matrix and a pantograph operational matrix for Bernstein polynomials; these are then utilized to reduce the solution of optimal control with constant...

Convergence of Numerical Method For the Solution of Nonlinear Delay Volterra Integral ‎Equations‎

In this paper, the solvability of nonlinear Volterra integral equations with general vanishing delays is stated. So far, sinc methods for approximating the solutions of Volterra integral equations have received considerable attention, mainly due to their high accuracy. These approximations converge rapidly to the exact solutions as the number of sinc points increases. Here the numerical solution of nonlinear...


Publication date: 2005